Mask alignment and measurement of critical dimensions in integrated circuits.

A method for determining alignment and critical dimensions
of regions formed on a semiconductor structure during
one or more process steps includes the steps of defining a
pattern A at a first location on the semiconductor device during
a process step, defining a second independent pattern B at the
first location on the semiconductor structure during another
process step, acquiring an image of the combination A and B of
both the first and second patterns, filtering that image to
attenuate higher spatial frequencies while preserving lower 
spatial frequencies, and comparing the sign result of the filtered 
image with the sign result of a stored image of the individual 
patterns to determine alignment errors. In the preferred 
embodiment the step of filtering includes taking the Laplacian
of Gaussian convolution of the image and saving the sign of the 
result. The comparison between the filtered image and the 
stored image uses the correlation function for the filtered 
images. Special circuitry is disclosed for performing the method 
rapidly enough to enable commercial applications. 
Description

MASK ALIGNMENT AND MEASUREMENT OF CRITICAL DIMENSIONS IN INTEGRATED CIRCUITS 



BACKGROUND OF THE INVENTION 

Field of the Invention 

This invention relates to the manufacture of integrated circuits, and specifically to a system for measuring 
and controlling the alignment of various layers in such circuits, as well as to a system for measuring critical 
dimensions of integrated circuits. The system permits alignment of masks and measurement of critical 
dimensions with accuracy greater than the resolution of the optical system employed in performing the
alignment and measurements. 

Description of the Prior Art 

In the manufacture of integrated circuits, a semiconductor wafer, typically silicon, is subjected to a complex
series of process operations to define active and passive components by doping regions within the wafer with
impurities. During and after these operations, layers of electrically conductive and insulating material are 
deposited and defined on the wafer to interconnect the active and passive components into a desired 
integrated circuit. The processing of the wafer usually employs techniques in which masking layers of various 
materials, such as photoresist, are deposited across the upper surface of the wafer. Using photolithographic
or other techniques, openings are defined in the photoresist or masking layers to allow selective introduction
of P- and N-conductivity type dopants, such as boron, phosphorus, arsenic and antimony, through the surface 
of the silicon wafer. These doped regions provide components, such as transistor emitters, resistors, etc., of 
the integrated circuit. In state-of-the-art integrated circuit fabrication technology, many separate masks are 
employed to define the ultimate integrated circuit. For example, some bipolar circuit fabrication processes
employ 13 different masks to selectively expose photoresist layers during different processes.

The economic revolution in electronics continues to be the result of the integrated circuit manufacturer's 
ability to place more and more components in a smaller and smaller area of the wafer. Because the cost of 
processing a single wafer is fixed, and substantially independent of the number of devices formed therein, 
decreasing the size of individual devices increases the number of devices formed in a wafer, and results in
lower cost per device.

As the individual components on an integrated circuit become progressively smaller, however, the 
importance of aligning each mask with the underlying wafer becomes greater. For example, if the minimum 
spacing between two electrically conductive lines on an integrated circuit is 5 microns, a 1-micron mask
misalignment will not electrically short the lines to each other. On the other hand, a 1-micron misalignment on
an integrated circuit having a minimum feature size of 1 micron will destroy the functionality of the circuit.
Conductive lines will be shorted to each other, while transistor components will be so misplaced as to render 
the devices nonfunctional. Thus, as the integrated circuit industry's capability to place more components on a 
given size chip increases, the importance of properly aligning each overlying layer with the underlying wafer 
becomes greater. 

One traditional approach to aligning or checking alignment of a layer with respect to the underlying
structure, for example, a photoresist pattern deposited on the wafer, employs comb-shaped alignment
patterns. A first comb-shaped pattern, for example, with teeth pointed north, is fabricated on the wafer in an
early process operation. A complementary comb-shaped pattern with teeth facing south and with a slightly 
different spacing between the teeth is formed later, for example, in a photoresist pattern applied to the wafer.
The second pattern is offset from the first pattern so that the tips of the teeth of the two patterns mesh. The
slightly different spacings of the teeth allow only one pair of opposing teeth to be aligned with each other at a 
time. The position of the aligned pair in the comb pattern provides a sensitive measure of the alignment error 
between the two layers. 

This vernier alignment pattern has proven satisfactory for many applications; however, distortion due to
interference fringes in the optical images of the patterns makes the line position difficult to determine.
Furthermore, the area of comparison in the comb structure is a very small region where the teeth approach 
each other. Thus, the pattern may be employed effectively by automatic alignment measurement systems only 
if the imaging device by which the pattern is viewed has resolution as fine as the desired alignment 
measurement and has very low noise levels as well. Additionally, the inherent accuracy limit is determined by 
the digitizing grid used by the main circuit layout. It is desirable to overcome this limitation.

A further deficiency of present alignment patterns is that automatic measurement with the patterns requires 
complex software for identifying the patterns, recognizing the teeth, etc. Thus, automating the alignment of 
such patterns is difficult. Of course, aligning such patterns manually is undesirably labor intensive, and subject 
to operator interpretation. Such measurements are tedious and subjective, and the operator must key the 
results into a terminal to control a computer integrated manufacturing system.

Critical dimensions on a layer of an integrated circuit, as opposed to alignment of different layers, usually are 
measured by human operators. A test pattern fabricated on the circuit usually has a series of parallel bars, 
each having a width equal to the critical dimension, and each spaced apart from adjacent bars by the critical 
dimension. Overexposure, assuming positive photoresist, results in the bars being narrower than desired,
consequently increasing the width of the spaces between them. Underexposure has the opposite effect,
widening the bars and narrowing the spaces. Using a microscope, the human operator measures the critical 
dimension by comparing the bar/space ratio to assure that it is within tolerances. Of course, this approach 
requires operator intervention, and is susceptible to the same difficulties in interpretation as described above,
that is, distortion due to interference fringes, distortion of the optical system used to examine the test pattern, 
and extreme difficulty in automating the measurement procedure. 

In another approach, an automated system is employed whereby the bars and spaces are viewed with a
microscope and television monitor. Using a gating network, one or more raster scans are selected for display 
on an oscilloscope to allow determination of bar/space dimensions. Unfortunately this approach suffers from
the same disadvantages described above. Furthermore, it is difficult to determine precisely which part of the 
waveform corresponds to the edge of the bar or space. 

SUMMARY OF THE INVENTION 

An improved technique is desired for aligning overlying layers on integrated circuits, checking their
alignment, and measuring critical dimensions on such circuits. We have developed a system which overcomes 
the deficiencies of prior art alignment and measurement techniques for integrated circuits, and which allows 
alignment and measurement of critical dimensions to be made automatically, with minimal or no operator 
assistance. Our method employs alignment targets comprised of isolated fine scale marks fabricated at the 
feature size limit of the process employed for the integrated circuit. The marks are arranged in an irregular,
typically random, pattern over a two-dimensional area of the circuit. The patterns are built with small sparse 
elements in a nonrepeating or irregular fashion, and are designed to have a large amount of low-spatial 
frequency energy. In our preferred embodiment, the patterns are constructed using a random number 
generator to determine the locations on a two-dimensional lattice having a dot. Figure 1 is an example of one 
such pattern.

Once an image of the pattern is acquired, for example, using conventional optical or SEM techniques, the 
image is digitized and filtered to enhance the low frequency information in the pattern while attenuating the 
high frequency information. Destruction of detailed fine information in this manner makes our invention less 
sensitive to noise and the optical vagaries of the measurement apparatus. Correlation techniques may then be 
employed to determine alignment and/or critical dimensions of the integrated circuit.

The system we have developed is insensitive to many process and sensor-related distortions which appear 
in images of small features. The technique also permits superimposition of two different patterns while 
enabling measurement of the position of each of them. Thus, if one pattern is defined on the wafer, and the 
second is placed in photoresist on top of the first, the relative alignment of the two patterns may be 
automatically determined.

In one embodiment of our system, a method of determining the alignment of regions formed on a 
semiconductor structure during separate process steps includes the steps of defining a first irregular pattern 
of elements at a first location on the semiconductor structure during a first process step, defining a second 
irregular pattern of elements also at the first location on the semiconductor structure during the second 
process step, acquiring an image of both the first and second patterns at the first location to provide an image
thereof, filtering the image to attenuate at least some higher spatial frequencies while preserving at least some 
lower spatial frequencies to thereby provide a filtered image, and comparing the filtered image with the stored 
image of at least one of the first or the second pattern to thereby determine the alignment of regions on the 
semiconductor structure. 

In an alternate embodiment the first and second patterns are spaced apart from each other to enable
alignment by measurement of the distance between the two patterns. In a further embodiment obscuring bars 
are extended through parts of the pattern to hide edges of the individual elements, and enable determination 
of critical dimensions on the semiconductor structure. 

BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is an irregular alignment pattern of the type employed in the preferred embodiment of the 
invention; 

Figure 2 illustrates schematically two such alignment patterns in adjacent locations on a wafer, with 
Figure 2a being a top view and Figure 2b a cross-sectional view; 

Figure 3 is a filtered image of Figure 1 showing the sign result after application of a Laplacian of
Gaussian convolution operator; 

Figure 4 is the correlation function of the sign array of Figure 3; 

Figure 5 is a second irregular alignment pattern of the type employed with the pattern of Figure 1;
Figure 6 is a combination of the patterns of Figures 1 and 5; 

Figure 7 is the filtered image of Figure 5;
Figure 8 is the filtered image of Figure 6; 

Figure 9 illustrates the correlation function of Figures 7 and 8, and of Figures 3 and 8, illustrating 
alignment of the peaks of the correlation functions; 

Figure 10 is a pattern used to measure critical dimensions on an integrated circuit; 

Figures 11a and 11b illustrate the manner by which the pattern of Figure 10 is employed in measuring
critical dimensions;

Figure 12 is a block diagram illustrating one system for correlating two filtered images; 
Figure 13 is a block diagram illustrating one technique for collecting pixel values for computing the 
Laplacian at video rates; 

Figure 14 is a block diagram illustrating one technique for obtaining the Laplacian of the pixel values 
collected in Figure 13; 

Figure 15 is a block diagram of the basic computing element of one technique for obtaining a Gaussian 
convolution at video rates; 

Figure 16 illustrates the repeated application of the technique of Figure 15 to compute a seven by seven 
two-dimensional Gaussian convolution of the Laplacian filtered signal from Figures 14 and 16; 

Figure 17 is a block diagram illustrating the combining of the Laplacian with two seven by seven 
Gaussian elements to efficiently produce a 21 by 21 Gaussian of Laplacian convolution; and 

Figure 18 is a block diagram of the correlator used to compare filtered images. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

Our invention includes a method for measuring alignment and measuring critical dimensions on an 
integrated circuit structure as well as apparatus for carrying out that method. The method is described below, 
followed by a description of the apparatus. 

Method of Operation 

We have developed a method for measuring alignment accuracy of overlying layers on integrated circuit 
structures. Our method employs alignment targets which are built up from isolated fine scale marks, typically 
marks fabricated at the feature size limit of the process employed to manufacture the integrated circuit. In a
preferred embodiment, these fine marks are spread out in a pattern in a two-dimensional array to form a target. 
Typically, the target will occupy a region of the integrated circuit where circuitry is not to be formed, for 
example, between bonding pads or in a region interior to the bonding pads. 

In the preferred embodiment, our fine scale features consist of small square "dots" distributed in an array. In 
the example the array consists of a matrix of 50 by 50 potential dot locations, with each potential dot being 
separated from its neighbors by the minimum dimension, i.e., one dot width. Figure 1 is an example of such a 
matrix. The actual locations for dots in the array of 50 by 50 potential dots are randomly selected. For the 
example shown in Figure 1, the determination of whether to place a dot in a given location is made employing a
random number generator with a 50% probability at each location. The pattern ("A") shown in Figure 1 is not
repeating and has a large low frequency content of spatial information, that is, the dots and spaces tend to
organize themselves into clusters. With respect to an integrated circuit, the pattern depicted might be 
fabricated by etching, oxidizing, diffusion, ion implantation, or any other well known process by which a 
detectable pattern may be formed in an integrated circuit structure. 
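
The construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the seed, and the rendering of one pixel per dot with one blank pixel between lattice sites are choices made here, not details taken from the text.

```python
import random

def make_dot_pattern(n=50, fill=0.5, seed=1):
    """Return an n-by-n lattice of 0/1 flags; 1 marks a dot (hypothetical helper)."""
    rng = random.Random(seed)
    return [[1 if rng.random() < fill else 0 for _ in range(n)] for _ in range(n)]

def render(lattice, pitch=2):
    """Place lattice sites on a pixel grid, separated by one dot width (assumed 1 pixel)."""
    n = len(lattice)
    size = n * pitch
    image = [[0] * size for _ in range(size)]
    for r in range(n):
        for c in range(n):
            image[r * pitch][c * pitch] = lattice[r][c]
    return image

pattern_a = make_dot_pattern()          # 50 by 50 potential dot locations
image_a = render(pattern_a)             # 100 by 100 pixel image
dot_count = sum(map(sum, pattern_a))    # roughly half of the 2500 sites
```

A fixed seed is used so that the same pattern can be regenerated later as the stored reference image.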

The particular pattern employed is not critical to our invention. Almost any pattern with sufficient low 
frequency structure may be used, including regular patterns. If regular patterns are chosen, then alignment 
can be made to a single instance of the pattern, but which instance must be otherwise specified. It is desirable 
for the pattern to contain low frequency information on two independent axes if alignment in two dimensions is 
sought. To provide relative insensitivity to optical distortion, high frequency noise and aliasing, the 
two-dimensional power spectrum of the pattern should have significant energy distributed over at least an
annulus centered on the origin of the power spectrum. 

In accordance with one embodiment of our method, a second pattern having small sparse elements will be 
formed in the next layer of the integrated circuit in proximity to, but not overlying, the pattern shown in Figure 1.
For example, as shown in Figure 2a, if the pattern shown in Figure 1 is formed in the semiconductor wafer 10 
itself, then another pattern 24 may be formed in photoresist layer 20 spaced apart from the pattern in wafer 10. 
In this embodiment pattern 24 usually will be identical to pattern 12, although this is not necessary. In a manner 
explained below, alignment of pattern 24 in the photoresist with the pattern 12 in the wafer 10 may be verified 
using the method of our invention. Of course, because formation of the pattern is accomplished by the same 
process and at the same time as formation of the surrounding components of the integrated circuit, the proper 
alignment of the two patterns verifies the proper alignment of the two layers and all of the components 
fabricated therein or thereby. 

Of substantial significance to the invention, the alignment of the components to be formed using the 
photoresist 20 with the underlying wafer 10 may be verified before the components are formed in the
integrated circuit, that is, before the diffusion, etching or other step is carried out. In this manner, if pattern B in
the photoresist 20 has been incorrectly positioned, the layer of photoresist may be stripped from the wafer 
(using well known techniques), a new layer of photoresist applied, and a new pattern defined therein. 
Fabricating the circuit in this manner assures that all of the regions formed in the wafer are properly aligned
with respect to each other—eliminating mask misalignment as a source of reduced yield. 

Once the two patterns are fabricated adjacent to each other in the manner depicted in Figures 2a and 2b,
direct measurements could be made between the two patterns to determine their relative alignment. It is 
difficult to perform such measurements with the necessary accuracy, however, as it is desired to align the two 
patterns to within substantially less than one dot diameter. We have discovered that by discarding some of the 
high frequency (detailed) information, while retaining some of the low frequency (clustering) information of 
Figure 1, alignment of the two regions may be achieved with greater accuracy than in the prior art, and
relatively independent of the quality of the optical system employed. 

In the preferred embodiment, images of the two targets 12 and 24 are acquired using conventional optical or 
scanning electron microscope technology. Once the image is acquired, it is digitized and a filtering operation 
is performed to enhance the low frequency structure in the pattern while attenuating the high frequency 
information. The result is a filtered image such as depicted in Figure 3. 

Although many different types of filters may be employed to enhance the low frequency structure, in the 
preferred embodiment a Laplacian of Gaussian ($\nabla^2 G$) convolution operator is used. The Gaussian is a
two-dimensional Gaussian which functions to low pass filter the image in a way that attenuates high spatial 
frequencies while preserving the geometric structure at lower spatial frequencies. The size of the Gaussian 
controls the scale at which structure remains in the filtered image. The Laplacian term detects locations in the 
low pass filtered image where local maxima in the rate of brightness change occur. These locations coincide 
closely with the locations where the Laplacian has zero value. It is known that: 

$$(\nabla^2 G) * I \equiv \nabla^2 (G * I) \equiv G * (\nabla^2 I) \qquad (1)$$

where $\nabla^2$ is the Laplacian, $G$ is the Gaussian and $I$ represents the image. Hence, the order in which the
operators are applied will not affect the result.
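
The identity of Eq. (1) can be checked numerically. The sketch below is an illustration only: it assumes a discrete five-point Laplacian, a separable sampled Gaussian, and wrap-around image borders (under which the two discrete operators commute up to rounding); none of these choices is specified in the text.

```python
import math
import random

def gaussian_kernel(sigma, radius):
    """Normalized 1-D Gaussian samples from -radius to +radius."""
    w = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in range(-radius, radius + 1)]
    total = sum(w)
    return [v / total for v in w]

def blur(img, kernel):
    """Separable Gaussian blur with wrap-around (periodic) borders."""
    n, m, r = len(img), len(img[0]), len(kernel) // 2
    rows = [[sum(kernel[k + r] * img[y][(x + k) % m] for k in range(-r, r + 1))
             for x in range(m)] for y in range(n)]
    return [[sum(kernel[k + r] * rows[(y + k) % n][x] for k in range(-r, r + 1))
             for x in range(m)] for y in range(n)]

def laplacian(img):
    """Five-point discrete Laplacian with wrap-around borders."""
    n, m = len(img), len(img[0])
    return [[img[(y - 1) % n][x] + img[(y + 1) % n][x]
             + img[y][(x - 1) % m] + img[y][(x + 1) % m] - 4.0 * img[y][x]
             for x in range(m)] for y in range(n)]

rng = random.Random(0)
image = [[rng.random() for _ in range(16)] for _ in range(16)]
g = gaussian_kernel(sigma=1.5, radius=4)

lap_of_blur = laplacian(blur(image, g))   # Laplacian applied to (G * I)
blur_of_lap = blur(laplacian(image), g)   # G applied to (Laplacian of I)
max_diff = max(abs(p - q)
               for ra, rb in zip(lap_of_blur, blur_of_lap)
               for p, q in zip(ra, rb))
```

With other border treatments (for example zero padding) the two orders agree only away from the image edges.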

Application of the Laplacian of Gaussian convolution operator and taking the sign of the result creates the 
image of Figure 3 from that of Figure 1. It will be appreciated that once the image is scanned and digitized such 
an operation may be performed using a suitably programmed conventional general purpose digital computer. 
While this is feasible, the use of conventional digital computers is undesirably slow, requiring on the order of 30 
seconds or more to process a single image. Because commercial applications of the system of our invention 
compel alignment much more quickly, we employ a processor with a special architecture to perform the 
necessary calculations. In this manner alignment may be verified in less than about 0.1 seconds. 

There are several approaches for approximating the Laplacian of Gaussian convolution of the irregular 
pattern we create. For example, a difference of 2 Gaussian convolutions where the Gaussians have space
constants $\sigma_e$ and $\sigma_i$ such that:



or so, and the Gaussians are normalized to have the same volume provides a very close approximation of the 
Laplacian of Gaussian convolution. The Laplacian of Gaussian convolution may also be considered as a 
bandpass filter because the Gaussian is a low-pass operation and the Laplacian a high-pass operation. Neither 
operation has very sharp roll-off to either side of the center frequency. Any filter with properties such as above 
will be appropriate for the approach we employ. Importantly, we only use the sign of the convolved signal 
because our technique relies upon the zero crossing locations of the convolution. 

Applied to the image of Figure 1 with an operator diameter about ten times the dot separation or larger, our 
technique exhibits in its sign pattern structures correlated with the clustering structure of the original dot 
pattern and not with individual dots. Because the sign pattern is thus tied to the overall position of the dot 
pattern on the image surface, and not with any particular location for any finer feature, it is insensitive to small 
distortions such as fringe effects, interference, dirt, noise, etc., which are present in the imaging. Because this 
technique captures coarse scale structure of the patterns and is relatively insensitive to high frequency noise, 
it Is ideally suited to scanning electron microscope images of the patterns, which typically exhibit significant 
noise. 
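
As a rough sketch of this filtering step, the code below builds a random dot image, applies a Gaussian blur followed by a discrete Laplacian, and keeps only the sign. The assumptions are mine: one pixel per dot on a lattice of pitch two pixels, wrap-around borders, a five-point Laplacian, and an operator width of 20 pixels. The relation $\sigma = W / (2\sqrt{2})$ between the Gaussian space constant and the diameter $W$ of the negative central region holds for the continuous operator and is only approximate on a pixel grid.

```python
import math
import random

def dot_image(n=50, pitch=2, seed=1):
    """Random dot pattern: each lattice site holds a dot with probability 0.5."""
    rng = random.Random(seed)
    size = n * pitch
    img = [[0.0] * size for _ in range(size)]
    for r in range(n):
        for c in range(n):
            if rng.random() < 0.5:
                img[r * pitch][c * pitch] = 1.0
    return img

def gaussian_kernel(sigma):
    radius = int(3 * sigma + 0.5)           # truncate at about three sigma
    w = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in range(-radius, radius + 1)]
    total = sum(w)
    return [v / total for v in w]

def blur(img, kernel):
    """Separable Gaussian blur with wrap-around borders."""
    n, m, r = len(img), len(img[0]), len(kernel) // 2
    rows = [[sum(kernel[k + r] * img[y][(x + k) % m] for k in range(-r, r + 1))
             for x in range(m)] for y in range(n)]
    return [[sum(kernel[k + r] * rows[(y + k) % n][x] for k in range(-r, r + 1))
             for x in range(m)] for y in range(n)]

def laplacian(img):
    """Five-point discrete Laplacian with wrap-around borders."""
    n, m = len(img), len(img[0])
    return [[img[(y - 1) % n][x] + img[(y + 1) % n][x]
             + img[y][(x - 1) % m] + img[y][(x + 1) % m] - 4.0 * img[y][x]
             for x in range(m)] for y in range(n)]

def sign_of_log(img, width):
    """Sign (+1 or -1) of the Laplacian of Gaussian response; zero maps to +1."""
    sigma = width / (2.0 * math.sqrt(2.0))
    filtered = laplacian(blur(img, gaussian_kernel(sigma)))
    return [[1 if v >= 0.0 else -1 for v in row] for row in filtered]

signs = sign_of_log(dot_image(), width=20.0)
```

The resulting array plays the role of the binary sign image; only this array, not the raw filter output, is carried forward to the correlation step.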

After filtering the next step of our method is to measure displacements between the patterns 12 and 24 
(Figure 2a) using a correlation function. The correlation function of the sign of the Laplacian of Gaussian 
convolution can be estimated as follows: Let the image $I(x,y)$ be a Gaussian random process with uniform
spectrum and let

$$C(x,y) = \nabla^2 G * I(x,y) \qquad (3)$$

where * denotes a two-dimensional convolution. The correlation of $C(x,y)$ when the image $I(x,y)$ is taken to be
Gaussian white noise has the form 



where $k$ is a constant and $W$ is the diameter of the negative central region of the $\nabla^2 G$ convolution function.
The correlation $R_s(\tau)$ of the sign of Eq. (3), $S(x,y) = \operatorname{sgn}[C(x,y)]$, obeys an arcsine law when $C$ is a Gaussian
random process.



$$R_s(\tau) = \frac{2}{\pi} \sin^{-1}\left[\frac{R_c(\tau)}{R_c(0)}\right] \qquad (5)$$
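
The arcsine law of Eq. (5) can be illustrated with a small Monte Carlo experiment. The sketch below is not part of the described apparatus: it draws pairs of zero-mean Gaussian values with a chosen normalized correlation `rho` (standing in for the normalized correlation of the convolution output) and compares the measured correlation of their signs with $(2/\pi)\sin^{-1}\rho$. The sample count and seed are arbitrary choices.

```python
import math
import random

def sign_correlation(rho, samples=20000, seed=0):
    """Empirical correlation of sgn(x) and sgn(y) for Gaussian x, y with correlation rho."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    total = 0
    for _ in range(samples):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + s * rng.gauss(0.0, 1.0)   # y has unit variance and correlation rho with x
        total += 1 if (x >= 0.0) == (y >= 0.0) else -1
    return total / samples

def arcsine_law(rho):
    """Predicted sign correlation for a Gaussian process."""
    return (2.0 / math.pi) * math.asin(rho)

comparison = [(rho, sign_correlation(rho), arcsine_law(rho)) for rho in (0.0, 0.5, 0.9, 1.0)]
```

Each tuple in `comparison` holds the input correlation, the measured sign correlation, and the prediction; the two should agree to within sampling error.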



Figure 4 shows the correlation function of the sign of a Laplacian of Gaussian convolution. As shown by
Figure 4, when two patterns such as depicted in Figure 3 are compared with each other, a single strongly
correlated location, corresponding to the peak of the cone in Figure 4, results. Misalignment, either 
horizontally or vertically, by any significant amount, results in a much lower correlation. The width of the base of 
the cone is controlled largely by the size of the filter employed, with larger filters providing broader cones. 
Sharpness of the peak is due to the binary nature of the filtered image. The use of the sign, rather than the raw 
convolution values, also causes the correlation peak to be normalized with a height of one in the absence of 
noise. Once the height of the correlation surface has been measured at several locations around the peak, a 
model of the surface may be constructed and the peak location estimated with great accuracy. Once this 
process is performed for each of target 12 and target 24 (see Figure 2a), the separation between the peaks 
may be accurately calculated and compared with the desired value to thereby provide an indication of the 
respective alignment of the two targets. Of course, because each of the targets 12 and 24 was formed at the 
same time as other regions in that layer of the semiconductor structure, their relative alignment also provides 
information about the relative alignment of everything in the respective layers. 
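
A minimal sketch of the correlation search follows. For brevity it correlates a random array of +1 and -1 values with a shifted copy of itself rather than two real filtered images, uses wrap-around indexing, and searches only integer shifts; all three are simplifying assumptions made here. In the absence of noise the peak height is one, as stated above.

```python
import random

def shift(img, dy, dx):
    """Copy of img moved down by dy and right by dx, with wrap-around."""
    n, m = len(img), len(img[0])
    return [[img[(y - dy) % n][(x - dx) % m] for x in range(m)] for y in range(n)]

def sign_corr(a, b, dy, dx):
    """Mean product of a[y][x] and b[y+dy][x+dx] over the whole array."""
    n, m = len(a), len(a[0])
    total = sum(a[y][x] * b[(y + dy) % n][(x + dx) % m]
                for y in range(n) for x in range(m))
    return total / (n * m)

def find_peak(a, b, reach=4):
    """Return (correlation, dy, dx) at the best integer shift within +/- reach."""
    return max((sign_corr(a, b, dy, dx), dy, dx)
               for dy in range(-reach, reach + 1)
               for dx in range(-reach, reach + 1))

rng = random.Random(3)
stored = [[rng.choice((-1, 1)) for _ in range(32)] for _ in range(32)]
observed = shift(stored, 2, -3)          # known displacement to recover
peak = find_peak(stored, observed)
```

Sampling `sign_corr` at several shifts around the returned peak gives the correlation heights from which a surface model, and hence a sub-pixel peak position, can be fitted.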

Use of a second target on wafer 10 or layer 20 would allow triangulation to measure offset in a perpendicular 
orientation to Figure 2b. Furthermore, although the targets are shown as in the wafer and in photoresist on the 
wafer, such targets could be formed in any two layers if it is desired to determine their relative alignment, for 
example, in a silicon nitride masking layer and in a second layer of metal connections.

Unfortunately, using the above-described technique to align the regions on the semiconductor structure 
has an undesirable side effect in that it requires an imaging system with very high geometric precision because 
the two targets are placed side by side. For example, if the patterns on the two layers being checked for 
alignment are separated by 100 microns and the imaging system has a geometric accuracy of 2%, then there 
will be a ±2 micron uncertainty introduced by the imaging system. This 2 micron uncertainty may be more than 
an order of magnitude greater than the precision desired in making such measurements. Accordingly, we have 
invented a further technique for alignment and verification of alignment of regions on semiconductor 
structures. 

Figure 5 illustrates another irregular dot pattern ("B") fabricated at a desired location on a semiconductor 
structure. As with Figure 1 the individual "dots" are spaced apart by one dot height and width. By 
superimposing this second independent pattern "B" on the first pattern "A" with the dots of pattern "B" falling
into the spaces of pattern "A," each pattern may be treated as noise to the other when the position of each is 
determined. The positions of the two patterns then may be compared to determine whether the patterns are 
correctly aligned. The pattern "A + B" shown in Figure 6 is an example of this approach. It has been created by 
combining the pattern depicted in Figure 1 with that shown in Figure 5. Each of Figures 1 and 5 has a dot
pattern in which the dots are spaced on a pitch which is twice the dot diameter. In Figure 6, one of the patterns is
shifted by one dot diameter to the left and down before being combined with the other pattern. In this manner, 
the dots of the two patterns do not overlap. Thus, with reference to the example previously described, the 
pattern of Figure 1 will have been formed on the wafer substrate, while the pattern of Figure 5, shifted by one 
dot diameter both horizontally and vertically, is formed in the overlying photoresist. When viewing the structure 
from above, both patterns will be visible. 
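
The interleaving of the two patterns can be sketched as follows. The layout is an assumption made for illustration: each dot occupies one pixel, pattern A sits on a lattice of pitch two pixels, and pattern B uses the same lattice moved one pixel down and one pixel to the left, so that its dots fall into the spaces of A.

```python
import random

def random_lattice(n, seed):
    """n-by-n lattice of booleans; True marks a dot (50% probability per site)."""
    rng = random.Random(seed)
    return [[rng.random() < 0.5 for _ in range(n)] for _ in range(n)]

def combine(a, b):
    """Superimpose lattices a and b on one pixel grid without overlap."""
    n = len(a)
    image = [[0] * (2 * n) for _ in range(2 * n)]
    for r in range(n):
        for c in range(n):
            if a[r][c]:
                image[2 * r][2 * c + 1] += 1    # pattern A site
            if b[r][c]:
                image[2 * r + 1][2 * c] += 1    # pattern B: one pixel down, one left
    return image

pattern_a = random_lattice(50, seed=1)
pattern_b = random_lattice(50, seed=2)
combined = combine(pattern_a, pattern_b)
```

Because the two lattices never share a pixel, every entry of `combined` is 0 or 1 and the dot count is the sum of the two constituent counts.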

Figure 7 illustrates the convolution sign pattern achieved using the Laplacian of Gaussian filter described
above for the dot pattern of Figure 5. Figure 8 is a corresponding pattern for Figure 6. Note that the sign 
pattern of Figure 8 does not resemble either of the patterns of Figure 3 or 7. There is, however, a significant 
correlation between the sign structure in Figures 8 and 3 and also between Figures 8 and 7. This is illustrated 
by the correlation plots of Figure 9. The upper plot in Figure 9 represents the correlation between Figure 8 and 
Figure 7, while the lower plot represents the correlation between Figure 8 and Figure 3. In the manner 
previously described, the position of the peak in the individual plots may be determined by making use of as 
many measurements of the correlation surface as desired. In this manner the peak of each may be accurately 
determined. Once the individual peaks are located, the difference in peak positions for the two surfaces 
represents the alignment error between the two layers. As with the example of Figure 2, this difference may be
determined within a very small fraction of the diameter of a pixel. Because the two patterns, A and B, lie in the 
same part of the camera image and the correlation peaks for each of them are typically separated by less than 
a dot diameter, the geometric distortion of the optical system has a negligible effect on accuracy. 

As an example of the precision with which the peak position may be determined, the upper correlation 
surface of Figure 9 was sliced at discrete correlation values beginning with c = 0.2 in increments of 0.03. Each
slice produces a disk shaped cross-section of the cone shaped correlation surface. The x positions of the 
boundary points on this disk were carefully estimated using linear interpolation. Then, the disk's center of 
mass in the X direction was calculated. Likewise the center of mass was calculated for each of the other slices 
taken through the correlation surface. Finally the average of all the centers of mass was taken to estimate the x
position of the correlation peak. The same process was then used to estimate the peak position of the lower
surface in Figure 9. As a result, even though the images were made with a Vidicon camera having about a 2%
geometric distortion across its field of view, the difference in peak position for horizontal alignment was only
about 1/20th of a pixel. The effective pixel size here was about the same as the pattern's dot size, thus a 1
micron dot provides an alignment resolution of approximately 0.05 microns.
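
The slicing procedure can be sketched in one dimension. This is a simplified illustration, not the procedure as run on Figure 9: it works on a single cross-section of an ideal cone instead of a two-dimensional surface, so each "disk" reduces to an interval whose center is the midpoint of its two interpolated boundary points. The cone shape, its radius, the true peak position of 10.3 pixels, and the upper slice level of 0.8 are all assumptions made here.

```python
def slice_center(samples, level):
    """Midpoint of the two linearly interpolated points where the profile crosses `level`."""
    n = len(samples)
    left = next((i for i in range(1, n)
                 if samples[i - 1] < level <= samples[i]), None)
    right = next((i for i in range(n - 2, -1, -1)
                  if samples[i + 1] < level <= samples[i]), None)
    if left is None or right is None:
        return None                      # the slice does not cut the profile
    x_left = (left - 1) + (level - samples[left - 1]) / (samples[left] - samples[left - 1])
    x_right = right + (samples[right] - level) / (samples[right] - samples[right + 1])
    return 0.5 * (x_left + x_right)

def estimate_peak(samples, start=0.2, step=0.03, count=21):
    """Average the slice centers for levels start, start + step, ..."""
    centers = [slice_center(samples, start + step * k) for k in range(count)]
    centers = [c for c in centers if c is not None]
    return sum(centers) / len(centers)

true_peak = 10.3
profile = [max(0.0, 1.0 - abs(x - true_peak) / 8.0) for x in range(21)]
peak = estimate_peak(profile)
```

Although the profile is sampled only at integer pixel positions, the averaged slice centers recover the fractional peak position, which is the sense in which the measurement resolves a small fraction of a pixel.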

The approach described above also can be employed to align masks prior to exposing the photoresist by 
imaging the mask pattern onto the substrate at suitably low energies (or appropriate wavelengths) to prevent 
exposure of the photoresist. Employing the same steps as described above, the alignment of the projected 
pattern and pattern present in the previous layers may be compared. Furthermore, because the photoresist is
not exposed, any errors detected may be corrected by the stepper or other mask fabrication apparatus. 

Figure 10 illustrates a test pattern for measuring critical dimensions on an individual layer of an integrated 
circuit structure. The pattern of Figure 10 is fabricated by forming on a single layer of the integrated circuit
device a pattern of the type depicted in Figure 6, that is, a pattern which is itself a combination of two other
constituent patterns. The constituent patterns are designed such that obscuring bars which extend through
the pattern obscure the right-hand edges of the pattern elements on the "right" pattern and the left-hand
edges of the pattern elements on the "left" pattern. Although the pattern formed on the integrated circuit itself
consists of two patterns, the pattern is formed in one operation or process step, not two. The two constituent
patterns, however, are separately stored in the memory of our system and separately filtered. The pattern on
the integrated circuit then may be employed to measure critical dimensions in the manner described below.
The foregoing approach for measuring critical dimensions on integrated circuit structures functions because 
the pattern analysis technique ignores the regular obscuring bars, and is sensitive only to the perceived 
position of the random elements. This is because the regular bar pattern in Figure 10 has a small low spatial 
frequency content compared with that of the dot patterns, and so the resulting sign pattern of the V *G 
condition is essentially independent of the presence or absence of the regular bar pattern. 25 

The manner by which the critical dimensions may be measured is depicted in Figures 11a and 11b. As shown
in Figure 11a, the "actual" pattern is underexposed (assuming positive photoresist) in comparison to the
nominal "intended" pattern. Figure 11b illustrates that when the bars, which the ∇²G filter does not see, are
removed, the centers of the individual elements having left edges obscured appear to shift to the right, while
the pattern with right edges obscured appears to shift to the left. In comparison, if the pattern is overexposed
(again, assuming positive photoresist), the actual pattern will be smaller than the intended pattern. In this case,
the detected position of the pattern with left edges obscured appears to shift to the left, while the pattern with
right edges obscured appears to shift to the right.

Thus, to measure critical dimensions after the pattern has been formed, the image of the complete pattern
(both left and right patterns; see, e.g., Figure 10) is acquired and filtered. Next, in a manner like that described
in conjunction with Figure 9, the convolution of the stored "left" pattern is correlated with the combined
pattern, and the convolution of the stored "right" pattern is correlated with the combined pattern. Once the
peaks of the two correlations are determined, an offset between them may be calculated. This offset will
provide information about the critical dimension on the integrated circuit. For example, if the pattern is nominal
as desired, then the "left" and "right" correlation peaks will coincide. On the other hand, if the pattern is
underexposed, the correlation peak for the "right" pattern will be shifted away from the "left" pattern's peak. If
the fabricated pattern is overexposed, then the correlation peak of the right pattern will appear shifted in the
other direction with respect to the left pattern. The displacement between the two peaks thus will indicate the
critical dimension error.
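
The geometric reason the two peaks separate can be sketched with a deliberately idealized one-dimensional model. In the Python fragment below, each element is an interval, a bar hides one edge, and the perceived position is taken to be the midpoint of the visible part. The midpoint model, the element width, the bar position, and the function names are assumptions chosen for illustration; the actual response of the filter to a real pattern will be more complicated.

```python
def midpoint(lo, hi):
    """Perceived position of a visible interval, modeled as its midpoint."""
    return (lo + hi) / 2.0


def apparent_shifts(bias, width=4.0, bar=1.0):
    """Return (left_shift, right_shift) relative to the nominal pattern.

    bias  : distance each element edge moves outward (positive = element too large)
    width : nominal element width (hypothetical units)
    bar   : how far the obscuring bar reaches into the element (hypothetical)
    """
    # "Left" pattern: the left edge is hidden under a bar, so only the right edge moves.
    left_nominal = midpoint(bar, width)
    left_actual = midpoint(bar, width + bias)
    # "Right" pattern: the right edge is hidden, so only the left edge moves.
    right_nominal = midpoint(0.0, width - bar)
    right_actual = midpoint(0.0 - bias, width - bar)
    return left_actual - left_nominal, right_actual - right_nominal


left_shift, right_shift = apparent_shifts(bias=0.2)
peak_offset = left_shift - right_shift  # separation between the two correlation peaks
```

In this simplified model an element that is too large moves the left-obscured pattern to the right and the right-obscured pattern to the left, each by half the edge bias, so the separation between the peaks equals the edge bias. A negative bias reverses both shifts.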

Figure 12 is a simplified flowchart illustrating a preferred embodiment of one method of our invention. The
method will be discussed in conjunction with the embodiment of Figures 6 and 9. As has been described, a
first irregular pattern A is placed on one layer of the integrated circuit, and a second irregular pattern B (or A in
some embodiments) is placed on another layer, either immediately above pattern A or adjacent to pattern A. An
image of the composite pattern is then acquired and correlated with a stored model of pattern A, then with a
stored model of pattern B. The difference between the peak positions of these two correlation surfaces
represents the alignment error between the two layers.

Figure 12 is a flowchart illustrating the overall correlation process. As shown in Figure 12, two inputs are
supplied to the process, a stored pattern 30, and a digitized image 31 preferably acquired using an optical or
scanning electron microscope. An optical microscope is preferred for embodiments where cost is a significant
factor in application of the system. Adjustments in the size of the image may be made using the magnification
control. These adjustments will affect the height of the peak and permit compensating for errors in the SEM
preset. With respect to the example in which the combined pattern (A + B) is correlated with the first pattern
A, the stored pattern 30 will consist of digital data representative of the pattern A formed on the underlying
semiconductor layer. The image 31 will be of the combined patterns A + B. Once the stored pattern 30 is
retrieved, any necessary scaling 32 is applied to the pattern to convert it to the proper size for comparison with
the acquired image 31. In the preferred embodiment, scaling is accomplished by using conventional image
scaling software. Typically this scaling operation will be done infrequently since the scaled image can be
stored and reused. Fine adjustments to the magnification can be made using feedback from the correlation
hardware to determine the magnification that yields the highest correlation peak.

After scaling step 32 is performed, the representation of the stored pattern is filtered to attenuate higher
spatial frequencies while preserving some of the lower spatial frequencies. This step is represented by block
34. The apparatus for performing the filtering is described below. In a similar manner the acquired image is also
filtered, as shown by step 33. The filtered results are each stored, with the image being stored in
storage 35 and the representation of pattern A stored in storage 36. Note that storage 36 can be
established once and then reused so long as the patterns and imaging geometry are not changed. Under
external control, typically from a microprocessor 39, the contents of the two random access memories 35 and
36 are dumped to an exclusive OR gate 37. The exclusive OR gate 37, in effect, counts the "hits" to determine
the correlation between the stored pattern 30 and the image 31. By supplying a different offset to one of the
RAMs 35 or 36 than to the other, the digital representations of the images may be shifted with respect to each
other for comparison. Counter 38 will count the hits for each shift, in effect, enabling the microprocessor 39 to
measure selected points on the peaked correlation surface. Once a sufficient number of measurements have
been made, the location of the peak can be estimated.
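
The exclusive OR comparison and hit counting can be sketched in software as follows. This Python fragment is only an illustration under stated assumptions: the sign images are small lists of 0/1 bits filled with pseudo-random test data, every shift is evaluated exhaustively (whereas the hardware measures only selected points on the correlation surface), and the names count_hits and correlation_peak are hypothetical.

```python
import random


def count_hits(image, model, dx, dy):
    """Count sign agreements between model and image with the model shifted by (dx, dy)."""
    hits = 0
    for y, row in enumerate(model):
        for x, bit in enumerate(row):
            # XOR is 1 where the two sign bits differ, so a hit is XOR == 0.
            if (image[y + dy][x + dx] ^ bit) == 0:
                hits += 1
    return hits


def correlation_peak(image, model):
    """Evaluate every shift that keeps the model inside the image; return the best one."""
    max_dy = len(image) - len(model)
    max_dx = len(image[0]) - len(model[0])
    scores = {
        (dx, dy): count_hits(image, model, dx, dy)
        for dy in range(max_dy + 1)
        for dx in range(max_dx + 1)
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Synthetic test data: an 8 x 8 sign pattern embedded at shift (dx=2, dy=1)
# inside a 12 x 12 field of unrelated sign bits.
rng = random.Random(0)
model = [[rng.randint(0, 1) for _ in range(8)] for _ in range(8)]
image = [[rng.randint(0, 1) for _ in range(12)] for _ in range(12)]
for y in range(8):
    for x in range(8):
        image[y + 1][x + 2] = model[y][x]

peak, hits = correlation_peak(image, model)
```

At the true shift every bit agrees, so the count equals the number of model points; at unrelated shifts about half the bits agree, which is what gives the correlation surface its peaked shape.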

In a manner analogous to that described above, the procedure of Figure 12 may be repeated, this time with
the stored pattern representing pattern B and the scanning electron microscope image again representing the
combined patterns. In this manner, the correlation surface for pattern B correlated with the combination of
patterns A and B may be determined. Once the two peaks are determined, the alignment error is represented
by the distance between the two peaks.

If it is desired to measure critical dimensions, then the procedure described in conjunction with Figure 12
will be employed, with the stored pattern on a first pass representing one of the left or right patterns, and on a
second pass representing the other of the left and right patterns. In the manner explained above, the
difference in location between the peaks will be indicative of the critical dimension sought to be measured.

Apparatus for Carrying Out the Method 

As mentioned above, to carry out the method of our invention in a practical or commercial context, the
images acquired or retrieved must be processed much more rapidly than a general purpose digital computer
would permit. Accordingly, one of us has developed certain special purpose hardware for performing the
filtering functions at a sufficiently high rate to enable alignment measurements to be made in less than 0.1
seconds.

In the case of either retrieval of a stored pattern or acquisition of an image through an optical or scanning
electron microscope, a digital representation of the pattern is required. In the case of retrieval of a stored
pattern, because the pattern is stored digitally, only scaling is required. In the case of an optical or SEM image,
however, the image must first be digitized using, for example, a commercially available A/D converter
operating at video rates. For the sake of explanation, assume that in the preferred embodiment, the pattern to
be filtered consists of approximately 50 by 50 elements and is displayed in an area of approximately 500 by 500
pixels.

Once the digitized pixel pattern has been acquired, the first step of our method is to apply the Laplacian
function. Although either the Laplacian or Gaussian functions could be applied first without affecting the result,
one of us has determined that application of the Laplacian function first provides certain advantages. In
particular, by applying the Laplacian function first, the video signal is centered on zero and then smoothed.
This allows better use of n bit integer resolution because the n bit numbers are not required to characterize as
wide a range of signal as would exist were the Gaussian function applied first. Applying the Laplacian first
reduces the amount of scaling required through the Gaussian pipeline operation to keep the values in range.
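
A one-dimensional analogue may make both points concrete: the order of the two operators does not change the result, but the intermediate values differ greatly in magnitude. The Python sketch below uses the second difference (-1, 2, -1) in place of the two-dimensional Laplacian and an unnormalized (1, 2, 1) smoothing step in place of the Gaussian. The sample signal, the helper valid3, and the restriction to interior samples are assumptions made for this illustration.

```python
def valid3(signal, kernel):
    """Apply a symmetric 3-tap kernel at interior samples only (no edge padding)."""
    k0, k1, k2 = kernel
    return [
        k0 * signal[i - 1] + k1 * signal[i] + k2 * signal[i + 1]
        for i in range(1, len(signal) - 1)
    ]


SECOND_DIFFERENCE = (-1, 2, -1)  # 1-D analogue of the center-weighted Laplacian
SMOOTH = (1, 2, 1)               # unnormalized 3-point binomial smoothing

# A video-like signal: a small feature riding on a large constant level.
signal = [200, 201, 203, 206, 203, 201, 200]

laplacian_only = valid3(signal, SECOND_DIFFERENCE)   # intermediate, Laplacian first
smooth_only = valid3(signal, SMOOTH)                 # intermediate, smoothing first

laplacian_then_smooth = valid3(laplacian_only, SMOOTH)
smooth_then_laplacian = valid3(smooth_only, SECOND_DIFFERENCE)
```

For this signal the Laplacian-first intermediate values stay within a few counts of zero, while the smoothing-first intermediate values still carry the full constant level, so more bits are needed to hold them before the final result is formed.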

The 500-pixel square image 40 is shown at the top of Figure 13. If the Laplacian function is to be applied to
pixel C, then the 8 bit binary value for each of pixels A-E must be retrieved and appropriately weighted. The
apparatus of Figure 13 illustrates one technique for retrieving the desired pixels. As the desired pattern is
acquired as a noninterlaced video raster scan, either from memory 44 or from microscope 42 and A-D
converter 43, at some given instant pixel A will be supplied on line 45. At that point in time, line 46 will carry B,
the value received 499 pixels earlier. Similarly, lines 47, 48, and 49 will hold pixels C, D, and E, respectively,
which were received 500, 501 and 1000 pixels earlier than A. Thus, this configuration produces 5 simultaneous
samples making a cross pattern on the image as shown at the top of Figure 13. Note that when the next pixel
arrives at 45, the entire cross pattern of samples will move one pixel to the right on the image following the
raster scan. As the value for each pixel is retrieved, it may be latched before being supplied to subsequent
processing. The delay elements 50 shown in Figure 13 may comprise any known delay element, for example, a
shift register or a random access memory. Switches 51 and 52 control whether the pixels latched are from the
retrieved pattern 44 or the acquired image 40.
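
The delay-line arrangement can be modeled in software with a short history buffer. The Python sketch below is an illustration only: it parameterizes the line length as width (500 in the text, 5 in the small example), uses a deque in place of the shift registers or memories, and the name cross_taps is hypothetical.

```python
from collections import deque


def cross_taps(stream, width):
    """Yield (A, B, C, D, E) for each incoming raster pixel.

    A is the newest pixel; B, C, D and E were received width - 1, width,
    width + 1 and 2 * width pixels earlier (499, 500, 501 and 1000 for width 500).
    """
    history = deque(maxlen=2 * width + 1)
    for pixel in stream:
        history.append(pixel)
        if len(history) == 2 * width + 1:
            a = history[-1]
            b = history[-1 - (width - 1)]
            c = history[-1 - width]
            d = history[-1 - (width + 1)]
            e = history[-1 - 2 * width]
            yield a, b, c, d, e


# Small example: a 5-pixel-wide raster whose pixel values are their raster indices.
taps = list(cross_taps(range(20), width=5))
# taps[2] is produced when pixel 12 (row 2, column 2) arrives.
a, b, c, d, e = taps[2]
```

In the example, C is pixel 7 (row 1, column 2), with A directly below it, E directly above it, B to its right and D to its left, which is the cross pattern described above. Near the ends of a raster line the taps wrap onto neighboring lines, a detail this sketch does not handle.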

Figure 14 illustrates apparatus for obtaining the Laplacian of the acquired pixel values. One of us has
determined that a satisfactory approximation to the Laplacian function at a given pixel location is to apply a
weight of 4 to that particular pixel and a weight of -1 to the pixels above, below, to the left, and to the right of
the specified pixel. As shown in Figure 14, the pixel values for pixels A and B are supplied to adder 60, while
those for pixels D and E are supplied to adder 61. As a result, adder 60 supplies an output signal A + B on line
62, while adder 61 supplies an output signal D + E on line 63. Another adder 64 connected to receive the
signals on line 62 and 63 then supplies an output signal on line 65 indicative of the sum of all of pixels A, B, D,
and E.

The pixel value for pixel C is supplied to a shifter 66. By shifting the pixel value two places left, the value is
effectively multiplied by four, and the results supplied on line 67 to subtractor 68. Subtractor 68 combines the
sum supplied on line 65 with the quadruply-weighted value on line 67 to achieve a new value which
approximates the Laplacian at pixel C of the input image. Thus the output of subtractor 68 carries a video raster
signal of the Laplacian of the input image. This signal is fed to the next stage of processing, the Gaussian convolver.
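
The arithmetic of Figure 14 reduces to two levels of addition, one shift and one subtraction. A minimal Python sketch follows; the function name laplacian_at is hypothetical, and the comments map each step to the reference numerals in the text.

```python
def laplacian_at(a, b, c, d, e):
    """Approximate the Laplacian at pixel C from its four neighbors A, B, D, E."""
    sum_ab = a + b                 # adder 60
    sum_de = d + e                 # adder 61
    neighbors = sum_ab + sum_de    # adder 64
    center_times_4 = c << 2        # shifter 66: two places left multiplies by four
    return center_times_4 - neighbors  # subtractor 68: 4*C - (A + B + D + E)


flat = laplacian_at(7, 7, 7, 7, 7)       # uniform region
spot = laplacian_at(0, 0, 10, 0, 0)      # isolated bright pixel
```

A uniform region yields zero, so the output is centered on zero regardless of the overall brightness of the input.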

The manner by which the Gaussian convolution is applied is shown in Figures 15 and 16.

In our preferred embodiment, we make use of the fact that a two-dimensional Gaussian convolution can be 
decomposed into a composition of one-dimensional Gaussian convolutions. To see this, note that the 
two-dimensional Gaussian can be written as the product of two one-dimensional Gaussians: 



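\[ G(x,y) = \exp\left[-\frac{x^2 + y^2}{\sigma^2}\right] \tag{6} \]

\[ = \exp\left[-\frac{x^2}{\sigma^2}\right] \exp\left[-\frac{y^2}{\sigma^2}\right] \tag{7} \]

\[ = G(x)\,G(y) \tag{8} \]

This allows us to decompose the two-dimensional convolution integral as follows:

\[ G(x,y) * I(x,y) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} G(u,v)\, I(x-u,\, y-v)\, du\, dv \tag{9} \]

\[ = \int_{-\infty}^{\infty} G(v) \left[ \int_{-\infty}^{\infty} G(u)\, I(x-u,\, y-v)\, du \right] dv \tag{10} \]

\[ = G(y) * \left[ G(x) * I(x,y) \right] \tag{11} \]

where I(x,y) is the input image to be convolved.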

Thus, we are able to accomplish a two-dimensional Gaussian convolution by means of two cascaded
one-dimensional convolutions which are much less expensive computationally to accomplish. The
one-dimensional Gaussian operator may be approximated by a binomial distribution in one dimension. For
example, the seven point binomial distribution 1, 6, 15, 20, 15, 6, 1 is quite close to the Gaussian. In our
preferred embodiment, we employ a three point binomial operator with weights of 1, 2, 1 three times to
produce the effect of a convolution with the seven point binomial distribution. This choice allows a particularly
efficient hardware implementation. This is illustrated in Figure 16.
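
That three passes of the 1, 2, 1 operator are equivalent to one pass of the seven point operator follows from the associativity of convolution, and can be checked directly. The Python sketch below is a plain software check; the helper name convolve is an assumption of the example and has no counterpart in the hardware.

```python
def convolve(a, b):
    """Full discrete convolution of two weight sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


g3 = [1, 2, 1]
g5 = convolve(g3, g3)   # two passes of the three point operator
g7 = convolve(g5, g3)   # three passes
gain = sum(g7)          # total weight to be normalized away
```

The resulting weights are 1, 6, 15, 20, 15, 6, 1 with a total of 64, so the normalization for the full seven point operator amounts to a right shift of six places, or two places per three point stage.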

Figure 15 illustrates the operation of the three point mechanism, G3. A digital raster input is applied to the
input of two serially connected delay elements. These delay elements will both introduce a delay of n pixels
between their input and output terminals, where n = 1 or 2 pixels for horizontal convolutions and n = the
image line length or twice the image line length for vertical convolutions. From these delay elements we
obtain 3 simultaneous values A, B, and C separated by n pixels from each other in the image. A and C are
applied to adder 70 and the sum supplied on line 71 to a shifter 72. Shifter 72 shifts the sum of A + C one place
to the right, in effect dividing it by two. The output signal on line 73 is supplied to adder 74 in conjunction with
the binary value for pixel B. Adder 74 thereby provides on line 75 a value equal to the sum of the value of pixel B
plus one-half the sum of values of pixels A and C. To maintain correct amplitude, this result is shifted right one
place by shifter 76, and the result supplied on line 77. The result on line 77 is the input signal smoothed by a 3
point binomial distribution. To obtain a finer approximation to the Gaussian, the procedure of Figure 15 may be
repeated more than once as shown in Figure 16.
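
The shift-and-add data path of Figure 15 can be written out as integer arithmetic. In the Python sketch below the function names g3 and smooth are hypothetical, the tap spacing n is a parameter as in the text, and truncation on each right shift is an assumption about how the shifters round.

```python
def g3(a, b, c):
    """Three point binomial smoothing using only adds and right shifts."""
    half_sum = (a + c) >> 1    # adder 70 and shifter 72: (A + C) / 2
    total = half_sum + b       # adder 74: B + (A + C) / 2
    return total >> 1          # shifter 76: restores the amplitude


def smooth(samples, n=1):
    """Apply g3 along a sample stream with taps spaced n samples apart."""
    return [
        g3(samples[i - 2 * n], samples[i - n], samples[i])
        for i in range(2 * n, len(samples))
    ]


impulse_response = smooth([0, 0, 16, 0, 0])
constant_response = smooth([10, 10, 10, 10])
```

An impulse of height 16 comes out as 4, 8, 4, which is the 1, 2, 1 weighting divided by four, and a constant input passes through unchanged. Setting n to the image line length would turn the same operator into a vertical convolution on a raster stream.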

Figure 16 illustrates how a pipeline of 3 three point Gaussian convolution elements G3 of Figure 15 is
configured. This device convolves the input video stream with a 7 point binomial approximation to a
one-dimensional Gaussian G7. If the delay elements are set to produce a delay of one pixel, this will be a
horizontal Gaussian convolution. If the delay elements are set to the line length of the image this will be a
vertical convolution.

Convolving with a seven point normalized Gaussian operator reduces the amplitude of a typical Laplacian
filtered image by about a factor of two. In the preferred embodiment, therefore, we need shift one less bit right
for each seven point Gaussian operator. In other words, the amplitude of the output signal is boosted by a
factor of two after application of each seven point Gaussian operator. In terms of the hardware required, this
allows us to gain one bit of precision with each subsequent seven point Gaussian operator. Thus, in a
preferred system having four such operators, four bits are gained, or equivalently four bits are saved in the
pipeline data width while obtaining the same precision in the final output. One of the G3 operators of each G7
operator does not include the normalization operation for the reasons discussed above. To accomplish a 7
by 7 two-dimensional Gaussian convolution, we assemble two G7 elements, one with delay set to do a
horizontal convolution and the other with delay set for a vertical convolution.

The lower portion of Figure 16 illustrates using a pipeline of two G7 elements to produce a G7x7 element. In
practice, a larger Gaussian filter than the 7 by 7 operator described is required. Techniques similar to those
described above could be used to build arbitrarily large filters; however, we have found a more efficient
approach. After the application of the G7x7 filter, the input signal has been low pass filtered sufficiently that a
subsequent operator that samples only every other pixel will not suffer from aliasing problems. Thus a 14 by 14
Gaussian convolution can be approximated by the G7x7 operator with its points spread out by increasing its
horizontal delays from 1 to 2 pixels and its vertical delays from one line length to two line lengths.

Figure 17 shows two G7x7 elements used in this way to produce effectively a 21 by 21 Gaussian convolution
operator in a video pipeline with the Laplacian operator at the start. The result is a ∇²G filtered image which
may be supplied to suitable storage means, for example, a random access memory such as described in
conjunction with blocks 35 and 36 of Figure 12. Because only the sign of the result is saved, the filtered image
will be a binary one, having the appearance, for example, of Figures 3, 7 or 8.

Figure 18 is a more detailed block diagram of one embodiment of the correlator previously shown in only
block form in Figure 12. Generally, the components shown in Figure 18 are driven by a 10 megahertz pixel
clock. On the rising edge pixel data is stored into the buffers 80a-80f, while on the falling edge pixel data is read
out. In the preferred embodiment the filtered image input data (from Figure 17) is supplied in parallel to six
buffer networks 80a-80f shown in Figure 18. The contents of one of the buffers 80a is shown in greater detail;
similar components make up each of the remaining buffers 80b-80f. Buffer 80a includes a 64k-by-1 bit memory
81 for storing approximately one-quarter of the image. Memory 81 corresponds to either block designated
storage in Figure 12. In the preferred embodiment memory 81 is a static random access memory and stores
256-by-256 pixels of data. The choice of the size of memory 81 is arbitrary, and larger or smaller memories may
be employed depending upon the particular application for the correlator of Figure 18. Storage of the
convolver output data in memory 81 allows scanning of the contents of the memory for purposes of image
correlation.

Address generator counters 82 supply addressing information to the memory 81 to control the location
where data from the convolver is stored. The address generator counters typically include a first counter for
identifying a unique horizontal line, and a second counter for identifying a pixel position on that line. The
addresses from which data in the static random access memory 81 are read are controlled ultimately by
counter 83 and x,y register 84. X,y register 84 provides information regarding the center point of a correlation
window on the image. By varying the position of the window's center point, one image may be shifted with
respect to the other. Counter 83 counts the number of points in a window to be compared with the other image
data in other buffers, while offset table 85 provides information regarding the offset within the window of each
image point to be compared. The offset table 85 will typically contain about 4,000 points. Data from the offset
table 85 and from the x,y register 84 are combined by adder 86 to provide read address information for memory
81. The data from the convolver are supplied both to the left buffer 80a and the right buffers 80b-80f. Typically,
the left buffer, 80a, will be filled with convolver data from either the stored pattern or from the SEM. The right
buffers would then be filled with convolver data from the other source. The use of six buffers allows five values
to be generated by the exclusive OR networks 90a-90e during each cycle. The left buffer allows one value to be
used for comparison, while the five right buffers allow five locations in the right image to be examined in
parallel. These five ports onto the right image allow 5 points on the same correlation surface, Figure 4, to be
computed in parallel. They can also be used to compute a single large correlation window by breaking it into
five subwindows.

The offsets are provided through a look-up table and index to each point in the correlation window. This
allows comparison of any desired set of points in one image with any desired set of points in another image.
This facility allows correcting for fixed geometric distortions between the two images. It also allows placing
more emphasis on the center of the correlation window by spacing the samples more closely near the center
of the image.
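
The addressing scheme, in which a window center is added to each entry of an offset table to form a read address, can be sketched as follows. This Python fragment is an illustration under stated assumptions: the nine-entry table, the checkerboard-like test image, and the name window_hits are hypothetical, and a real table would hold on the order of 4,000 points.

```python
def window_hits(left, right, center_left, center_right, offsets):
    """Count sign agreements at the sample points listed in an offset table."""
    lx, ly = center_left
    rx, ry = center_right
    hits = 0
    for ox, oy in offsets:
        # Read address = window center + table offset (the role of adder 86).
        left_bit = left[ly + oy][lx + ox]
        right_bit = right[ry + oy][rx + ox]
        if (left_bit ^ right_bit) == 0:   # XOR of 0 means the sign bits agree
            hits += 1
    return hits


# Offset table with samples spaced more closely near the window center.
offsets = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1),
           (3, 0), (-3, 0), (0, 3), (0, -3)]

# Synthetic 9 x 9 binary image made of 2 x 2 blocks.
image = [[(x // 2 + y // 2) % 2 for x in range(9)] for y in range(9)]

aligned = window_hits(image, image, (4, 4), (4, 4), offsets)
shifted = window_hits(image, image, (4, 4), (5, 4), offsets)
```

Moving only the right-hand window center, as the x,y register does, shifts one image with respect to the other without moving any stored data. Because the table is arbitrary, its entries could equally be chosen to follow a known geometric distortion.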

Each of the exclusive OR networks 90a-90e contains a latch 91, an exclusive OR gate 92, and one or more
counters depending upon the number of bits of counting desired. Latch 91 holds the output value from the left
buffer and one of the right buffers (depending upon which exclusive OR network) and supplies the resulting
data to the exclusive OR gate 92. XOR gate 92 drives desired counters to provide as many bits of accuracy as
desired. In the preferred embodiment a 12-bit counter 93 is provided by three smaller 4-bit counters. Read
enable lines coupled to the counters 93 enable supplying the output from a desired counter to a transceiver
(not shown).

In another embodiment of our invention, patterns already present in the integrated circuit masks and layers
are employed in place of the irregular patterns especially generated. In many integrated circuits, the patterns in
the masks have sufficient low spatial frequency energy. Furthermore, the patterns on various layers are
somewhat independent from each other and combine in an additive manner. In such an application, a larger
area of the "chip" may be employed for measuring the correlation function against a CAD data base for each
layer. The larger area available for correlation compensates for the less-than-ideal patterns.

Although preferred embodiments of the method and apparatus of our invention have been described above, 
these embodiments are for the purposes of illustration and not limitation. The scope of our invention may be 
determined from the appended claims. 

Claims



1. A method of determining alignment of regions formed on a semiconductor structure during separate
process steps comprising:

defining a first pattern at a first location on the semiconductor structure during a first process step and
at a second location during a second process step;

acquiring an image of the first patterns at the first and the second locations with apparatus to provide
an image;

filtering the image to attenuate at least some higher spatial frequencies while preserving at least some
lower spatial frequencies to thereby provide a filtered image of the first patterns; and

using the filtered image to determine alignment of the first pattern at each of the first and second
locations to thereby determine alignment of regions on the semiconductor structure.

2. A method as in Claim 1 wherein the step of filtering further comprises applying a two-dimensional 
Gaussian convolution. 

3. A method as in Claim 2 wherein the step of filtering further comprises applying both a Laplacian
operator and the Gaussian convolution to thereby provide convolution values, each value having a sign.

4. A method as in Claim 3 wherein the step of using the filtered image comprises correlating the sign of
the convolution values of the filtered image at each of the first and second locations.

5. A method as in Claim 1 wherein the first location and the second location are adjacent each other, but 
in different layers. 

6. A method as in Claim 1 wherein the first pattern comprises a first plurality of elements disposed in an 
array and spaced apart from each other.

7. A method of determining alignment of regions formed on a semiconductor structure during separate 
process steps comprising: 

defining a first pattern at a first location on the semiconductor structure during a process step;

defining a second pattern also at the first location on the semiconductor structure during another
process step, the second pattern overlying the first pattern;

acquiring an image of both the first and second patterns at the first location;

filtering the image to attenuate at least some higher spatial frequencies while preserving at least some
lower spatial frequencies to thereby provide a filtered image; and

comparing the filtered image with a stored filtered image of each of the first pattern and the second
pattern to thereby determine alignment of regions on the semiconductor structure.

8. A method as in Claim 7 wherein the step of filtering further comprises applying a Laplacian operator.

9. A method as in Claim 8 wherein the step of filtering further comprises applying a two-dimensional 
Gaussian convolution to thereby provide convolution values, each value having a sign. 

10. A method as in Claim 9 wherein the step of comparing the filtered image comprises correlating the
sign of the convolution values of the filtered image with the sign of the convolution values of each of the
stored filtered images of the first and second patterns.

11. A method as in Claim 10 wherein the step of defining a second pattern further comprises defining the 
second pattern at the first location with the elements of the second pattern disposed between the 
elements of the first pattern. 

12. A method as in Claim 11 wherein the elements in each of the first and second patterns are disposed
substantially randomly in the respective array. 

13. A method of determining a dimension on a semiconductor structure comprising: 

defining a composite pattern of elements having edges at a location on the semiconductor structure,
the pattern being itself comprised of a first pattern and a second pattern;

defining a plurality of obscuring means extending through the composite pattern, the obscuring means
obscuring a first set of edges of the elements of the first pattern and a second set of edges of the
elements of the second pattern;

acquiring an image of the composite pattern and the obscuring means;

filtering the image to attenuate at least some higher spatial frequencies while preserving at least some
lower spatial frequencies to thereby provide a filtered image; and

comparing the filtered image of the composite pattern with a stored filtered image of the first pattern
and with a stored filtered image of the second pattern to thereby determine any shift in the location of the
first and second patterns which shift is indicative of the dimension to be determined.

14. A method as in Claim 13 wherein the first set of edges and the second set of edges comprise 
opposite edges.

15. A method as in Claim 14 wherein the step of filtering further comprises applying a Laplacian operator. 

16. A method as in Claim 15 wherein the step of filtering further comprises applying a two-dimensional 
Gaussian convolution to thereby provide convolution values, each value having a sign. 

17. A method of determining alignment of regions defined on media during separate process steps
comprising:

defining a first pattern at a first location on the media during a process step;

defining a second pattern also at the first location on the media during another process step;

acquiring an image of both the first and second patterns at the first location;

filtering the image to attenuate at least some higher spatial frequencies while preserving at least some
lower spatial frequencies to thereby provide a filtered image; and

comparing the filtered image with a stored filtered image of the first pattern to thereby determine
alignment of the regions on the media.